# Zero-shot video classification

Xclip Large Patch14 Kinetics 600

X-CLIP is an extended version of CLIP for general video-language understanding, trained on video-text pairs through contrastive learning.

Transformers English

Xclip Base Patch16 Zero Shot

X-CLIP is a minimal extension of CLIP for general video-language understanding. It is trained contrastively on (video, text) pairs and can be used for zero-shot, few-shot, or fully supervised video classification, as well as for video-text retrieval.
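To illustrate how a contrastively trained model of this kind can classify a video without task-specific training, here is a minimal sketch of the scoring step only. It assumes the video and the candidate label texts have already been encoded into embedding vectors. The function names, the toy 3-dimensional vectors, and the `logit_scale` value of 100 are illustrative assumptions, not values or APIs taken from the X-CLIP model card.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_scores(video_embedding, label_embeddings, logit_scale=100.0):
    """Return one softmax probability per label from scaled cosine similarities.

    logit_scale is a hypothetical temperature; real checkpoints learn their own.
    """
    v = l2_normalize(video_embedding)
    logits = []
    for text_vec in label_embeddings:
        t = l2_normalize(text_vec)
        logits.append(logit_scale * sum(a * b for a, b in zip(v, t)))
    # Subtract the maximum logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings standing in for encoder outputs (made up for illustration).
labels = ["playing guitar", "riding a bike", "cooking"]
label_embeddings = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 0.8]]
video_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_scores(video_embedding, label_embeddings)
best_label = labels[probs.index(max(probs))]
print(best_label)
```

Because the labels are supplied as text at inference time, the label set can be changed without retraining, which is what makes the zero-shot setting possible.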

Transformers English

Xclip Base Patch16

X-CLIP is an extended version of CLIP for general video-language understanding, trained via contrastive learning on (video, text) pairs, suitable for tasks like video classification and video-text retrieval.

Transformers English
